open-source software
An LLM-based Quantitative Framework for Evaluating High-Stealthy Backdoor Risks in OSS Supply Chains
Yan, Zihe, Luo, Kai, Yang, Haoyu, Yu, Yang, Zhang, Zhuosheng, Li, Guancheng
In modern software development workflows, the open-source software supply chain significantly contributes to efficient and convenient engineering practices. As system complexity increases, it has become common practice to use open-source software as third-party dependencies. However, due to the lack of maintenance for underlying dependencies and insufficient community auditing, ensuring the security of source code and the legitimacy of repository maintainers has become a challenge, particularly in the context of highly stealthy backdoor attacks such as the XZ Utils incident. To address these problems, we propose a fine-grained project evaluation framework for backdoor risk assessment in open-source software. Our framework models highly stealthy backdoor attacks from the attacker's perspective and defines targeted metrics for each attack stage. Moreover, to overcome the limitations of static analysis in assessing the reliability of repository maintenance activities, such as irregular committer privilege escalation and insufficient review participation, we employ large language models (LLMs) to perform semantic evaluation of code repositories while avoiding reliance on manually crafted patterns. The effectiveness of our framework is validated on 66 high-priority packages in the Debian ecosystem, and the experimental results reveal that the current open-source software supply chain is exposed to a range of security risks.
- Workflow (0.69)
- Research Report (0.50)
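The abstract's idea of defining targeted metrics per attack stage and combining them into a repository-level assessment can be illustrated with a small sketch. The stage names, weights, and aggregation rule below are hypothetical, chosen only to show the shape of such a scoring scheme, not the paper's actual metrics:

```python
# Hypothetical per-attack-stage risk aggregation. Stage names and
# weights are illustrative placeholders, not the paper's framework.
STAGE_WEIGHTS = {
    "maintainer_trust": 0.4,   # e.g. irregular committer privilege escalation
    "review_coverage": 0.3,    # e.g. insufficient review participation
    "payload_stealth": 0.3,    # e.g. obfuscated build-script changes
}

def risk_score(stage_scores: dict[str, float]) -> float:
    """Weighted average of per-stage risk scores, each in [0, 1]."""
    total = 0.0
    for stage, weight in STAGE_WEIGHTS.items():
        score = stage_scores.get(stage, 0.0)
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {stage} must be in [0, 1]")
        total += weight * score
    return total

print(risk_score({"maintainer_trust": 0.9,
                  "review_coverage": 0.7,
                  "payload_stealth": 0.5}))  # 0.4*0.9 + 0.3*0.7 + 0.3*0.5 = 0.72
```

In the paper's setting, the per-stage scores would presumably come from the LLM-based semantic evaluation rather than manual input.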
Gym-TORAX: Open-source software for integrating RL with plasma control simulators
Mouchamps, Antoine, Malherbe, Arthur, Bolland, Adrien, Ernst, Damien
This paper presents Gym-TORAX, a Python package enabling the implementation of Reinforcement Learning (RL) environments for simulating plasma dynamics and control in tokamaks. Users succinctly define a set of control actions and observations, and a control objective, from which Gym-TORAX creates a Gymnasium environment that wraps TORAX to simulate the plasma dynamics. The objective is formulated through rewards that depend on the simulated state of the plasma and the control actions, in order to optimize specific characteristics of the plasma, such as performance and stability. The resulting environment instance is then compatible with a wide range of RL algorithms and libraries and will facilitate RL research in plasma control. In its current version, one environment is readily available, based on a ramp-up scenario of the International Thermonuclear Experimental Reactor (ITER).
- Europe > Switzerland > Vaud > Lausanne (0.04)
- Europe > Belgium > Wallonia > Liège Province > Liège (0.04)
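The Gymnasium-style interface the abstract describes can be sketched with a toy environment. The plasma dynamics below are a trivial stand-in for the TORAX solver, and names like `target_current` are illustrative assumptions, not the package's actual API:

```python
# Toy sketch of a Gymnasium-style ramp-up environment. The "dynamics"
# here are a placeholder for the TORAX plasma simulator.
class ToyRampUpEnv:
    """Reward = negative squared distance of plasma current to a ramp-up target."""

    def __init__(self, target_current: float = 15.0, horizon: int = 50):
        self.target_current = target_current  # illustrative ITER-like target, MA
        self.horizon = horizon
        self.current = 0.0
        self.t = 0

    def reset(self, seed=None):
        self.current, self.t = 0.0, 0
        return {"plasma_current": self.current}, {}

    def step(self, action: float):
        # Toy dynamics: the action directly sets the current ramp rate.
        self.current += action
        self.t += 1
        reward = -(self.current - self.target_current) ** 2
        terminated = self.t >= self.horizon
        return {"plasma_current": self.current}, reward, terminated, False, {}

obs, info = ToyRampUpEnv().reset()
```

Because the class follows the `reset`/`step` signatures of the Gymnasium API, any RL library targeting that interface could, in principle, train a controller against it.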
OSS-Bench: Benchmark Generator for Coding LLMs
Jiang, Yuancheng, Yap, Roland, Liang, Zhenkai
In light of the rapid adoption of AI coding assistants, LLM-assisted development has become increasingly prevalent, creating an urgent need for robust evaluation of generated code quality. Existing benchmarks often require extensive manual effort to create static datasets, rely on indirect or insufficiently challenging tasks, depend on non-scalable ground truth, or neglect critical low-level security evaluations, particularly memory-safety issues. In this work, we introduce OSS-Bench, a benchmark generator that automatically constructs large-scale, live evaluation tasks from real-world open-source software. OSS-Bench replaces functions with LLM-generated code and evaluates them using three natural metrics: compilability, functional correctness, and memory safety, leveraging robust signals like compilation failures, test-suite violations, and sanitizer alerts as ground truth. In our evaluation, the benchmark, instantiated as OSS-Bench(php) and OSS-Bench(sql), profiles 17 diverse LLMs, revealing insights such as intra-family behavioral patterns and inconsistencies between model size and performance. Our results demonstrate that OSS-Bench mitigates overfitting by leveraging the evolving complexity of OSS and highlights LLMs' limited understanding of low-level code security via extended fuzzing experiments. Overall, OSS-Bench offers a practical and scalable framework for benchmarking the real-world coding capabilities of LLMs.
- North America > United States > California > Sacramento County > Sacramento (0.04)
- Asia > Singapore (0.04)
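The three-metric scoring pipeline OSS-Bench describes (compilability, functional correctness, memory safety) can be sketched as a sequence of gates. The gate callables below are stubs standing in for a real compiler, test suite, and sanitizer run:

```python
# Hedged sketch of a three-gate evaluation in the spirit of OSS-Bench.
# The gate functions are stubs; the real tool drives actual compilers,
# project test suites, and sanitizers.
def evaluate(candidate: str,
             compiles,          # callable: source -> bool (compilation gate)
             passes_tests,      # callable: source -> bool (test-suite gate)
             sanitizer_clean):  # callable: source -> bool (sanitizer gate)
    result = {"compilable": False, "correct": False, "memory_safe": False}
    result["compilable"] = compiles(candidate)
    if result["compilable"]:
        # Only compilable code can be run against tests and sanitizers.
        result["correct"] = passes_tests(candidate)
        result["memory_safe"] = sanitizer_clean(candidate)
    return result

# Example with trivial stub gates:
report = evaluate("int f(void){return 0;}",
                  compiles=lambda s: True,
                  passes_tests=lambda s: True,
                  sanitizer_clean=lambda s: False)
```

The ordering reflects the abstract's point that compilation failures, test-suite violations, and sanitizer alerts serve as natural, scalable ground-truth signals.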
There Was Never Such a Thing as 'Open' AI
At the turn of the century, when the modern web was just emerging and Microsoft was king, a small but growing technology movement posed an existential threat to the company. Steve Ballmer, Microsoft's CEO at the time, called one of its core elements "a cancer that attaches itself" to "everything it touches." The disease was a competing operating system, Linux, and the open-source software it represented: programs that were free for anyone to download, modify, and use, in contrast to expensive, proprietary software such as Microsoft Windows and Office. Open-source software did eventually attach itself to much of the internet--Mozilla Firefox, the Android operating system, and Wikipedia are all "open" projects--but the tech industry managed to turn the egalitarian philosophy into a business opportunity. Trillion-dollar companies use free open-source software to build or enhance their own products.
- North America > Canada > Ontario > Toronto (0.15)
- North America > United States > California (0.05)
Tuna: Instruction Tuning using Feedback from Large Language Models
Li, Haoran, Liu, Yiran, Zhang, Xingxing, Lu, Wei, Wei, Furu
Instruction tuning of open-source large language models (LLMs) like LLaMA, using direct outputs from more powerful LLMs such as Instruct-GPT and GPT-4, has proven to be a cost-effective way to align model behaviors with human preferences. However, the instruction-tuned model has only seen one response per instruction, lacking the knowledge of potentially better responses. In this paper, we propose finetuning an instruction-tuned LLM using our novel \textit{probabilistic ranking} and \textit{contextual ranking} approaches to increase the likelihood of generating better responses. Probabilistic ranking enables the instruction-tuned model to inherit the relative rankings of high-quality and low-quality responses from the teacher LLM. On the other hand, learning with contextual ranking allows the model to refine its own response distribution using the contextual understanding ability of stronger LLMs. Furthermore, we apply probabilistic ranking and contextual ranking sequentially to the instruction-tuned LLM. The resulting model, which we call \textbf{Tuna}, consistently improves the performance on Super Natural Instructions (119 test tasks), LMentry (25 test tasks), Vicuna QA, and can even obtain better results than several strong reinforcement learning baselines. Our code and data are available at \url{ https://github.com/microsoft/LMOps}.
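The probabilistic-ranking idea of inheriting the teacher's relative ordering of responses can be illustrated with a schematic pairwise ranking loss. This is a generic hinge formulation written for illustration, not the paper's exact objective:

```python
# Schematic pairwise ranking loss: penalize the model when a
# lower-ranked response receives a higher log-likelihood than a
# higher-ranked one. Illustrative only, not Tuna's exact formulation.
def pairwise_ranking_loss(logps: list[float], margin: float = 0.1) -> float:
    """logps[i] = model log-likelihood of the i-th response,
    ordered best-to-worst by the teacher LLM's ranking."""
    loss = 0.0
    for i in range(len(logps)):
        for j in range(i + 1, len(logps)):
            # Hinge term: we want logps[i] >= logps[j] + margin.
            loss += max(0.0, margin - (logps[i] - logps[j]))
    return loss

# Perfectly ordered, well-separated log-probs incur zero loss:
print(pairwise_ranking_loss([-1.0, -2.0, -3.0]))  # 0.0
```

Minimizing such a loss pushes the model's likelihood ordering toward the teacher's ranking, which is the intuition behind learning from multiple ranked responses rather than a single one.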
PeaTMOSS: Mining Pre-Trained Models in Open-Source Software
Jiang, Wenxin, Jones, Jason, Yasmin, Jerin, Synovic, Nicholas, Sashti, Rajeev, Chen, Sophie, Thiruvathukal, George K., Tian, Yuan, Davis, James C.
Developing and training deep learning models is expensive, so software engineers have begun to reuse pre-trained deep learning models (PTMs) and fine-tune them for downstream tasks. Despite the widespread use of PTMs, we know little about the corresponding software engineering behaviors and challenges. To enable the study of software engineering with PTMs, we present the PeaTMOSS dataset: Pre-Trained Models in Open-Source Software. PeaTMOSS has three parts: a snapshot of (1) 281,638 PTMs, (2) 27,270 open-source software repositories that use PTMs, and (3) a mapping between PTMs and the projects that use them. We challenge PeaTMOSS miners to discover software engineering practices around PTMs. A demo and link to the full dataset are available at: https://github.com/PurdueDualityLab/PeaTMOSS-Demos.
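The dataset's three-part structure (PTMs, repositories, and a mapping between them) suggests join-style mining queries. The tiny sketch below uses hypothetical field names and toy data to show the shape of such a query; the real dataset ships with its own schema:

```python
# Toy sketch of a PeaTMOSS-style mining query: join the PTM snapshot to
# the repositories that use each model via the mapping table.
# All identifiers and data here are hypothetical.
ptms = {1: "bert-base-uncased", 2: "resnet-50"}          # ptm_id -> model name
repos = {10: "org/sentiment-app", 11: "org/vision-tool"} # repo_id -> repo name
mapping = [(1, 10), (2, 11), (1, 11)]                    # (ptm_id, repo_id)

def repos_using(ptm_name: str) -> list[str]:
    """All repositories that depend on the named pre-trained model."""
    ids = [pid for pid, name in ptms.items() if name == ptm_name]
    return sorted(repos[r] for p, r in mapping if p in ids)

print(repos_using("bert-base-uncased"))  # ['org/sentiment-app', 'org/vision-tool']
```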
Open Source Can Leverage Artificial Intelligence, Here Is How
Researchers and developers working on AI projects may find it easier to use open-source software because it is typically less expensive than proprietary software. This may lower the cost of creating AI solutions, which could boost the field's advancement. Over the past few years, the importance of open-source software in the realm of AI has increased significantly. One of its key advantages is the possibility for programmers to work together and exchange knowledge. By adopting open-source software, AI developers can build on the work of others and share their own contributions, promoting faster innovation and growth in the field of AI. As a result, the field may advance more quickly, as programmers can collaborate and benefit from one another's contributions.
- Information Technology > Software (1.00)
- Information Technology > Artificial Intelligence (1.00)
How Can Open Source Software Advance Progress Of Artificial Intelligence?
In Part 1 of this article, I wrote about how Artificial intelligence (AI) can advance open-source software. But is the converse true as well? Can the open-source world advance the progress of AI? Let's explore this reverse angle. The role of open-source software in AI has become increasingly important over the past few years. One of the primary benefits of open-source software is the ability for developers to collaborate and share knowledge.
- Information Technology > Software (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.52)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.33)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.33)
How Can Artificial Intelligence Advance Open Source Software?
Artificial intelligence (AI) has been making waves in various industries for its potential to revolutionize the way we work and live. One area where AI has made significant strides is in the realm of open-source software. Open source software refers to software whose source code is available for anyone to view, modify, and distribute. This collaborative approach to software development has been embraced by developers around the world, and has resulted in a vast array of high-quality software that is free to use. The role of AI in open-source software can be divided into two main categories: development and use. In terms of development, AI can help automate the process of producing, writing, and testing code.
What are data scientists' biggest concerns? The 2022 State of Data Science report has the answers
To further strengthen our commitment to providing industry-leading coverage of data technology, VentureBeat is excited to welcome Andrew Brust and Tony Baer as regular contributors. Data science is a quickly growing field as organizations of all sizes embrace artificial intelligence (AI) and machine learning (ML), and along with that growth has come no shortage of concerns. The 2022 State of Data Science report, released today by data science platform vendor Anaconda, identifies key trends and concerns for data scientists and the organizations that employ them. Among the trends identified by Anaconda is the fact that the open-source Python programming language continues to dominate the data science landscape. Among the key concerns identified in the report were the barriers to adoption of data science overall.